What does my group do?
- Study the molecular basis of variation in development and disease
- Using high-throughput experimental methods
June 14, 2017
What makes them different?
Much human variation is due to differences in ~6 million DNA base pairs (0.1% of the genome)
Genes are expressed differently during different stages and in different tissues.
DNA is packed, making certain parts inaccessible, and this packing is dynamic.
DNA methylation is a chemical modification of DNA, involved in gene expression regulation.
Large blocks of hypo-methylation in colon cancer
Genes with hyper-variable expression in colon cancer are enriched within these blocks.
Hypo-methylation blocks observed across five solid tumor types.
Gene expression hyper-variability enriched in hypo-methylation blocks in other cancer types.
Genes with consistent hyper-variable expression across tumors are tissue-specific.
metagenomeSeq, metagenomeFeatures, antiProfiles, minfi, bumphunter, HTShape, qsmooth, Rcplex, Rcsdp
Collaborative and exploratory analysis
Bsmooth, minfi)
epivizr package
Creativity in exploration
We are building software applications to support creative exploratory analysis of large genome-wide datasets…
Summarization: summarize integrated measurements (computed on data subsets)
Statistically-guided exploration: Calculate a statistic of interest
# Get tumor methylation base-pair data
m <- assay(se)[,"tumor"]
# Compute regions with highest variability across CpGs
region_stat <- calcWindowStat(m, step=25, window=80, stat=rowSds)
s <- region_stat[,"stat"]
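The slides do not define `calcWindowStat`, so the following is only a rough sketch of what a windowed statistic of this shape might look like, written in base R. The function name `window_stat_sketch`, the use of `sd` as the default statistic, and the simulated input values are all assumptions made for illustration; this is not the implementation used in the talk.

```r
# Hypothetical stand-in for calcWindowStat (not the function from the slides):
# slide a fixed-size window along a numeric vector of per-CpG values and
# compute one statistic per window.
window_stat_sketch <- function(x, step, window, stat = sd) {
  stopifnot(is.numeric(x), length(x) >= window, step >= 1, window >= 2)
  # First CpG index of every window
  starts <- seq(1, length(x) - window + 1, by = step)
  # Apply the statistic to the values inside each window
  stats <- vapply(starts, function(i) stat(x[i:(i + window - 1)]), numeric(1))
  # Same column names as the object used on the slide
  cbind(indices = starts, stat = stats)
}

# Simulated values (not real data): a highly variable stretch at CpGs 201-300
set.seed(1)
x <- c(rnorm(200, 0.5, 0.02), rnorm(100, 0.5, 0.2), rnorm(200, 0.5, 0.02))

region_stat <- window_stat_sketch(x, step = 25, window = 80)
s <- region_stat[, "stat"]
```

With `step = 25` and `window = 80`, consecutive windows overlap, so a variable stretch is picked up by several neighbouring windows rather than just one.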
Explore data based on statistic
What's around the regions with the highest variability across CpGs?
# Get locations in decreasing order of the statistic
o <- order(s, decreasing=TRUE)
indices <- region_stat[o, "indices"]
slideShowRegions <- rowRanges(se)[indices] + 1250000L
mgr$slideshow(slideShowRegions)
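To make the ranking and padding steps concrete, here is a minimal base R sketch. It assumes a small made-up `region_stat` matrix and evenly spaced, hypothetical CpG coordinates in `cpg_pos`, and it represents regions as a plain data frame rather than the genomic ranges object used in the talk. Adding an integer to a genomic ranges object widens each range on both sides, which the sketch imitates by subtracting and adding `pad`.

```r
# Hypothetical inputs: one row per window (start CpG index, statistic)
region_stat <- cbind(indices = c(1, 26, 51, 76),
                     stat    = c(0.02, 0.21, 0.05, 0.18))

# Hypothetical genomic coordinate of each CpG (evenly spaced for illustration)
cpg_pos <- seq(2e6, by = 1e4, length.out = 100)

# Rank windows from most to least variable
o <- order(region_stat[, "stat"], decreasing = TRUE)
indices <- region_stat[o, "indices"]

# Pad each location by 1.25 Mb on both sides, never going below position 1
pad <- 1250000L
regions <- data.frame(start = pmax(cpg_pos[indices] - pad, 1),
                      end   = cpg_pos[indices] + pad)
```

Each row of `regions` is then one stop of the slideshow, ordered from the most variable window to the least.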
Dynamically extensible: Easily integrate new data types and add new visualizations.
Visualization design goals
Visualization goals
One interpretation of Big Data: many relevant sources of contextual data
Acknowledgements
Justin Wagner, Jayaram Kancherla (CBCB)
Florin Chelaru (now at Google), Joseph Paulson (now at Genentech)
Feinberg Lab & K. Hansen (JHU), R. Irizarry (Harvard)
Funding: NIH, Genentech, Gates Foundation
More information